
    Empirical Bounds on Linear Regions of Deep Rectifier Networks

    We can compare the expressiveness of neural networks that use rectified linear units (ReLUs) by the number of linear regions, which reflect the number of pieces of the piecewise linear functions modeled by such networks. However, enumerating these regions is prohibitive and the known analytical bounds are identical for networks with the same dimensions. In this work, we approximate the number of linear regions through empirical bounds based on features of the trained network and probabilistic inference. Our first contribution is a method to sample the activation patterns defined by ReLUs using universal hash functions. This method is based on a Mixed-Integer Linear Programming (MILP) formulation of the network and an algorithm for probabilistic lower bounds of MILP solution sets that we call MIPBound, which is considerably faster than exact counting and reaches values in similar orders of magnitude. Our second contribution is a tighter activation-based bound for the maximum number of linear regions, which is notably stronger in networks with narrow layers. Combined, these bounds yield a fast proxy for the number of linear regions of a deep neural network. Comment: AAAI 202
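
To make the link between activation patterns and linear regions concrete, the sketch below counts the distinct activation patterns reached by random inputs to a small fully connected ReLU network. This is plain Monte Carlo sampling, not the MIPBound procedure described in the abstract, and it only gives a lower bound on the regions intersecting the sampled box. The function names, the input box [-1, 1]^d, and the toy weights are all assumptions made for illustration.

```python
import random

def activation_pattern(weights, biases, x):
    """Return the on/off state of every ReLU for input x, layer by layer."""
    pattern = []
    h = list(x)
    for W, b in zip(weights, biases):
        # Pre-activation of each unit in this layer.
        pre = [sum(w * v for w, v in zip(row, h)) + b_i
               for row, b_i in zip(W, b)]
        pattern.extend(p > 0 for p in pre)
        h = [max(0.0, p) for p in pre]  # ReLU output feeds the next layer
    return tuple(pattern)

def sampled_region_lower_bound(weights, biases, input_dim,
                               n_samples=2000, seed=0):
    """Count distinct activation patterns seen on uniform samples in [-1, 1]^d."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(input_dim)]
        seen.add(activation_pattern(weights, biases, x))
    return len(seen)

# Toy network (hypothetical): one input, one hidden layer with two units
# that switch on at x > 0 and x > 0.5, splitting [-1, 1] into three pieces.
toy_weights = [[[1.0], [1.0]]]
toy_biases = [[0.0, -0.5]]
print(sampled_region_lower_bound(toy_weights, toy_biases, input_dim=1))  # 3
```

Every pattern counted here corresponds to a region that was actually hit, so the count can only underestimate the true number; regions with small volume are easily missed, which is one reason a principled probabilistic bound is attractive.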

    When Lift-and-Project Cuts are Different

    In this paper, we present a method to determine if a lift-and-project cut for a mixed-integer linear program is irregular, in which case the cut is not equivalent to any intersection cut from the bases of the linear relaxation. This is an important question due to the intense research activity for the past decade on cuts from multiple rows of the simplex tableau as well as on lift-and-project cuts from non-split disjunctions. While it has been known since Balas and Perregaard (2003) that lift-and-project cuts from split disjunctions are always equivalent to intersection cuts and consequently to such multi-row cuts, Balas and Kis (2016) have recently shown that there is a necessary and sufficient condition in the case of arbitrary disjunctions: a lift-and-project cut is regular if, and only if, it corresponds to a regular basic solution of the Cut Generating Linear Program (CGLP). This paper has four contributions. First, we state a result that simplifies the verification of regularity for basic CGLP solutions from Balas and Kis (2016). Second, we provide a mixed-integer formulation that checks whether there is a regular CGLP solution for a given cut that is regular in a broader sense, which also encompasses irregular cuts that are implied by the regular cut closure. Third, we describe a numerical procedure based on such formulation that identifies irregular lift-and-project cuts. Finally, we use this method to evaluate how often lift-and-project cuts from simple t-branch split disjunctions are irregular, and thus not equivalent to multi-row cuts, on 74 instances of the MIPLIB benchmarks. Comment: INFORMS Journal on Computing (to appear

    Enumerative Branching with Less Repetition

    We can compactly represent large sets of solutions for problems with discrete decision variables by using decision diagrams. With them, we can efficiently identify optimal solutions for different objective functions. In fact, a decision diagram naturally arises from the branch-and-bound tree that we could use to enumerate these solutions if we merge nodes from which the same solutions are obtained on the remaining variables. However, we would like to avoid the repetitive work of finding the same solutions from branching on different nodes at the same level of that tree. Instead, we would like to explore just one of these equivalent nodes and then infer that the same solutions would have been found if we explored other nodes. In this work, we show how to identify such equivalences, and thus directly construct a reduced decision diagram, in integer programs where the left-hand sides of all constraints consist of additively separable functions. First, we extend an existing result regarding problems with a single linear constraint and integer coefficients. Second, we show necessary conditions with which we can isolate a single explored node as the only candidate to be equivalent to each unexplored node in problems with multiple constraints. Third, we present a sufficient condition that confirms whether such a pair of nodes is indeed equivalent, and we demonstrate how to induce that condition through preprocessing. Finally, we report computational results on integer linear programming problems from the MIPLIB benchmark. Our approach often constructs smaller decision diagrams faster and with less branching.
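
As an illustration of what merging equivalent nodes means, the sketch below builds a decision diagram for a single knapsack-style constraint sum(a[i] * x[i]) <= b over binary variables, assuming nonnegative integer coefficients. Two nodes on the same level are merged when their children coincide, which by induction means they admit the same completions. Unlike the approach described in the abstract, this naive version still explores every (level, slack) state before merging, so it shows the target structure rather than the savings in branching. All names are hypothetical.

```python
def build_reduced_dd(a, b):
    """Decision diagram for {x in {0,1}^n : sum(a[i] * x[i]) <= b}.

    Assumes nonnegative integer coefficients. Returns (root, nodes, states):
    nodes maps (level, lo_child, hi_child) -> node id, with id 0 as the
    terminal and None as an infeasible arc; states maps each explored
    (level, slack) pair to the id of the node it was merged into.
    """
    n = len(a)
    nodes = {}   # signature -> id; equal signatures mean equal completions
    states = {}  # (level, slack) -> id

    def node(level, slack):
        if slack < 0:
            return None          # constraint already violated
        if level == n:
            return 0             # all variables fixed: feasible terminal
        key = (level, slack)
        if key not in states:
            lo = node(level + 1, slack)             # x[level] = 0
            hi = node(level + 1, slack - a[level])  # x[level] = 1
            sig = (level, lo, hi)
            if sig not in nodes:
                nodes[sig] = len(nodes) + 1
            states[key] = nodes[sig]
        return states[key]

    return node(0, b), nodes, states

def count_solutions(root, nodes):
    """Count root-to-terminal paths, i.e. feasible solutions."""
    by_id = {nid: sig for sig, nid in nodes.items()}

    def paths(nid):
        if nid is None:
            return 0
        if nid == 0:
            return 1
        _, lo, hi = by_id[nid]
        return paths(lo) + paths(hi)

    return paths(root)

# x0 + x1 + 3*x2 <= 3: on the last level, slacks 2 and 1 both allow only
# x2 = 0, so those two states collapse into a single node.
root, nodes, states = build_reduced_dd([1, 1, 3], 3)
print(len(states), len(nodes), count_solutions(root, nodes))  # 6 5 5
```

The gap between explored states and merged nodes is the repeated work that an equivalence test applied before branching would avoid.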

    Reformulating the Disjunctive Cut Generating Linear Program

    Lift-and-project cuts can be obtained by defining an elegant optimization problem over the space of valid inequalities, the cut generating linear program (CGLP). A CGLP has two main ingredients: (i) an objective function, which invariably maximizes the violation with respect to a fractional solution x to be separated; and (ii) a normalization constraint, which limits the scale in which cuts are represented. One would expect that CGLP optima entail the best cuts, but the normalization may distort how cuts are compared, and the cutting plane may not be a supporting hyperplane with respect to the closure of valid inequalities from the CGLP. This work proposes the reverse polar CGLP (RP-CGLP), which switches the roles conventionally played by objective and normalization: violation with respect to x is fixed to a positive constant, whereas we minimize the slack for a point p that cannot be separated by the valid inequalities. Cuts from RP-CGLP optima define supporting hyperplanes of the immediate closure. When that closure is full-dimensional, the face defined by the cut lies on facets first intersected by a ray from x to p, all of which correspond to cutting planes from RP-CGLP optima if p is an interior point. In fact, these are the cuts minimizing a ratio between the slack for p and the violation for x. We show how to derive such cuts directly from the simplex tableau in the case of split disjunctions and report experiments on adapting the CglLandP cut generator library for the RP-CGLP formulation.
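
The role swap described above can be written schematically as follows. The notation is an assumption, not taken from the paper: a cut is written as $\alpha^\top y \ge \beta$, $\mathcal{C}$ denotes the cone of coefficient pairs $(\alpha, \beta)$ of inequalities valid for the disjunction (the CGLP feasible set without normalization), and the fixed violation is scaled to 1.

```latex
% Conventional CGLP: maximize the violation at x under some normalization.
\max_{(\alpha,\beta) \in \mathcal{C}} \; \beta - \alpha^\top x
\quad \text{s.t.} \quad \text{normalization}(\alpha, \beta)

% RP-CGLP (schematic): fix the violation at x, minimize the slack at p.
\min_{(\alpha,\beta) \in \mathcal{C}} \; \alpha^\top p - \beta
\quad \text{s.t.} \quad \beta - \alpha^\top x = 1

% Equivalent ratio view over cuts that are violated by x:
\min_{(\alpha,\beta) \in \mathcal{C},\; \beta - \alpha^\top x > 0}
\; \frac{\alpha^\top p - \beta}{\beta - \alpha^\top x}
```

Because $\mathcal{C}$ is a cone, scaling $(\alpha, \beta)$ leaves the ratio unchanged, which is why fixing the denominator to a positive constant and minimizing the numerator selects the same cuts.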

    Seamless Benchmarking of Mathematical Optimization Problems and Metadata Extensions

    Public libraries of problems such as the Mixed Integer Programming Library (MIPLIB) are fundamental to creating a common benchmark for measuring algorithmic advances across mathematical optimization solvers. They also often provide metadata on problem structure, hardness with respect to state-of-the-art solvers, and solutions with the best objective function value on record. In this short paper, we discuss some ways in which such metadata can be leveraged to create a seamless testing experience. In particular, we present MIPLIBing: a Python library that automatically downloads queried subsets from the current versions of MIPLIB, MINLPLib, and QPLIB, provides a centralized local cache across projects, and tracks the best solution values and bounds on record for each problem. While inspired by similar use cases from other areas, we reflect on the specific needs of mathematical optimization and discuss opportunities to extend benchmark sets to facilitate experimentation with different model structures.

    Estudo de correlação em mercados usando redes complexas (A study of correlation in markets using complex networks)

    Undergraduate thesis, Universidade de Brasília, Faculdade de Economia, Administração e Contabilidade e Ciência da Informação e Documentação, Departamento de Economia, 2012. In this work, we investigate the topological properties of banking networks and construct the minimum spanning tree (MST), which is based on the concept of ultrametricity, using the correlation matrix for a large number of banking variables. The empirical results suggest that private and foreign banks tend to form groups within the network and, moreover, that banks of different sizes are also strongly connected to each other and tend to form clusters. These results are robust to the use of different variables for building the network, such as bank profitability, assets, equity, revenues, and loans. We also use the MST methodology and its taxonomic tree to investigate the topological properties of the network of the Brazilian term structure of interest rates, using the correlation matrix between interest rates of different maturities. We show that the short-term interest rate is the most important within the interest rate network, which is consistent with the expectations hypothesis of interest rates, and we also find that the Brazilian interest rate network forms clusters by maturity.
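
A minimal sketch of building an MST from a correlation matrix is given below. It assumes the commonly used transformation d = sqrt(2 * (1 - rho)) from correlation to distance; the abstract does not state which distance the thesis uses, so that formula, the function name, and the toy matrix are assumptions. The tree is built with Kruskal's algorithm and a union-find structure.

```python
import math

def correlation_mst(corr):
    """Return MST edges (i, j, distance) for a symmetric correlation matrix."""
    n = len(corr)
    # Assumed metric: d_ij = sqrt(2 * (1 - rho_ij)); the max() guards
    # against tiny negative values caused by rounding.
    edges = sorted(
        (math.sqrt(max(0.0, 2.0 * (1.0 - corr[i][j]))), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for d, i, j in edges:       # Kruskal: take the shortest edge that
        ri, rj = find(i), find(j)
        if ri != rj:            # joins two different components
            parent[ri] = rj
            tree.append((i, j, d))
    return tree

# Toy example with three hypothetical series: 0 and 1 are strongly
# correlated, and 2 is closer to 1 than to 0.
toy_corr = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
for i, j, d in correlation_mst(toy_corr):
    print(i, j, round(d, 3))
```

In the toy example the tree keeps the edges (0, 1) and (1, 2) and drops the weakest link (0, 2), which is how clusters of strongly correlated series show up in the resulting network.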